Abstract

In this paper we give several new constructions of WOM codes. The novelty in our
constructions is the use of the so-called Wozencraft ensemble of linear codes. Specifically,
we obtain the following results.

We give an explicit construction of a two-write Write-Once-Memory (WOM for
short) code that approaches capacity, over the binary alphabet. More formally, for
every $\epsilon > 0$, $0 < p < 1$ and $n = (1/\epsilon)^{O(1/(p\epsilon))}$ we give a construction of a two-write
WOM code of length $n$ and rate $H(p) + 1 - p - \epsilon$. Since the capacity of a two-write
WOM code is $\max_p (H(p) + 1 - p)$, we get a code that is $\epsilon$-close to capacity.
Furthermore, encoding and decoding can be done in time $O(n^2 \cdot \mathrm{poly}(\log n))$ and time
$O(n \cdot \mathrm{poly}(\log n))$, respectively, and in logarithmic space.

We obtain a new encoding scheme for 3-write WOM codes over the binary alphabet.
Our scheme achieves rate $1.809 - \epsilon$, when the block length is $\exp(1/\epsilon)$. This gives a
better rate than what could be achieved using previous techniques.

We highlight a connection to linear seeded extractors for bit-fixing sources. In 
particular we show that obtaining such an extractor with seed length $O(\log n)$ can
lead to improved parameters for 2-write WOM codes. We then give an application of 
existing constructions of extractors to the problem of designing encoding schemes for 
memory with defects. 



1 Introduction 



In [RS82] Rivest and Shamir introduced the notion of write-once-memory and showed its
relevance to the problem of saving data on optical disks. A write-once-memory, over the
binary alphabet, allows us to change the value of a memory cell (say from 0 to 1) only
once. Thus, if we wish to use the storage device for storing t messages in t rounds, then we
need to come up with an encoding scheme that allows for t-write such that each memory
cell is written at most one time. An encoding scheme satisfying these properties is called
a Write-Once-Memory code, or a WOM code for short. This model has recently gained
renewed attention due to similar problems that arise when using flash memory devices. We
refer the readers to [YKS+10] for a more detailed introduction to WOM codes and their use
in encoding schemes for flash memory.

One interesting goal concerning WOM codes is to find codes that have good rate for
t-write. Namely, to find encoding schemes that allow us to save the maximal
information-theoretic amount of data possible under the write-once restriction. Following [RS82] it was
shown that the capacity (i.e. maximal rate) of a t-write binary WOM code is^1 $\log(t+1)$ (see
[RS82, Hee85, FV99]). Stated differently, if we wish to use an n-cell memory t times then
each time we can store, on average, $n \cdot \log(t+1)/t$ many bits.

In this work we address the problem of designing WOM codes that achieve the theoretical 
capacity for the case of two rounds of writing to the memory cells. Before describing our 
results we give a formal definition of a two-write WOM code.

For two vectors of the same length $y$ and $y'$ we say that $y' \le y$ if $y'_i \le y_i$ for every
coordinate $i$.

Definition 1.1. A two-write binary WOM of length $n$ over the sets of messages $\Omega_1$ and $\Omega_2$
consists of two encoding functions $E_1 : \Omega_1 \to \{0,1\}^n$ and $E_2 : E_1(\Omega_1) \times \Omega_2 \to \{0,1\}^n$ and
two decoding functions $D_1 : E_1(\Omega_1) \to \Omega_1$ and $D_2 : E_2(E_1(\Omega_1) \times \Omega_2) \to \Omega_2$ that satisfy the
following properties.

1. For every $x \in \Omega_1$, $D_1(E_1(x)) = x$.

2. For every $x_1 \in \Omega_1$ and $x_2 \in \Omega_2$, we have that $E_1(x_1) \le E_2(E_1(x_1), x_2)$.

3. For every $x_1 \in \Omega_1$ and $x_2 \in \Omega_2$, it holds that $D_2(E_2(E_1(x_1), x_2)) = x_2$.

The rate of such a WOM code is defined to be $(\log|\Omega_1| + \log|\Omega_2|)/n$.

Intuitively, the definition enables the encoder to use $E_1$ as the encoding function in the
first round. If the message $x_1$ was encoded (as the string $E_1(x_1)$) and then we wished to
encode in the second round the message $x_2$, then we write the string $E_2(E_1(x_1), x_2)$. Since
$E_1(x_1) \le E_2(E_1(x_1), x_2)$, we only have to change a few zeros to ones in order to move from
$E_1(x_1)$ to $E_2(E_1(x_1), x_2)$. The requirement on the decoding functions $D_1$ and $D_2$ guarantees
that at each round we can correctly decode the memory.^2 Notice that in the second round we
are only required to decode $x_2$ and not the pair $(x_1, x_2)$. It is not hard to see that insisting
on decoding both $x_1$ and $x_2$ is too strong a requirement, one that does not allow rate more than
1.
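As a concrete instance of Definition 1.1, the following sketch checks the classical three-cell code of Rivest and Shamir [RS82], which stores two bits twice in three cells (rate 4/3). The function names, and the rule that tells the two rounds apart by the weight of the stored word, are choices made for this illustration and are not taken from the text.

```python
# Rivest-Shamir two-write WOM code: 2 bits, written twice, in 3 cells.
# First-round codewords have weight <= 1; second-round codewords are
# their complements (weight >= 2).
FIRST = {0b00: (0, 0, 0), 0b01: (1, 0, 0), 0b10: (0, 1, 0), 0b11: (0, 0, 1)}
SECOND = {msg: tuple(1 - c for c in word) for msg, word in FIRST.items()}


def decode(word):
    """Decode a stored word; the weight tells which round wrote it."""
    table = FIRST if sum(word) <= 1 else SECOND
    inverse = {w: m for m, w in table.items()}
    return inverse[word]


def encode1(x1):
    """First-round encoder E_1."""
    return FIRST[x1]


def encode2(state, x2):
    """Second-round encoder E_2: only ever turns zeros into ones."""
    if decode(state) == x2:
        return state          # memory already reads x2, nothing to write
    return SECOND[x2]


# Rate as in Definition 1.1: (log|Omega_1| + log|Omega_2|) / n = (2 + 2) / 3.
rate = (2 + 2) / 3
```

Enumerating all sixteen message pairs confirms the three properties of the definition; in particular no cell is ever reset from 1 to 0.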

The definition of a t-write code is similar and is left to the reader. Similarly, one can also
define WOM codes over other alphabets, but in this paper we will only be interested in the
binary alphabet.



^1 All logarithms in this paper are taken base 2.

^2 We implicitly assume that the decoder knows, given a codeword, whether it was encoded in the first or
in the second round. At worst this can add another bit to the encoding and has no effect (in the asymptotic
sense) on the rate.

In [RS82] it was shown that the maximal rate (i.e. the capacity) that a WOM code can
have is at most $\max_p H(p) + (1 - p)$, where $H(p)$ is the entropy function. It is not hard to
prove that this expression is maximized for $p = 1/3$ and is equal to $\log 3$. Currently, the best
known explicit encoding scheme for two-write (over the binary alphabet) has rate roughly
1.49 (compared to the optimal $\log 3 \approx 1.585$) [YKS+10]. We note that these codes, of rate
1.49, were found using the help of a computer search. A more 'explicit' construction given
in [YKS+10] achieves rate 1.46.



Rivest and Shamir were also interested in the case where both rounds encode the same
amount of information. That is, $|\Omega_1| = |\Omega_2|$. They showed that the rate of such codes is at
most $H(p) + 1 - p$, for $p$ such that $H(p) = 1 - p$ ($p \approx 0.227$). Namely, the maximal possible
rate is roughly 1.5458. Yaakobi et al. described a construction (with $|\Omega_1| = |\Omega_2|$) that has
rate 1.375 and mentioned that using a computer search they found such a construction with
rate 1.45 [YKS+10].
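The two numerical claims above (the maximum of $H(p) + 1 - p$ at $p = 1/3$, and the equal-rounds bound at $p \approx 0.227$) are easy to check numerically. The sketch below is only a sanity check; the grid resolution and the bisection interval are arbitrary choices made here.

```python
import math


def H(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def two_write_rate(p):
    """Upper bound H(p) + 1 - p on the rate of a two-write WOM code."""
    return H(p) + 1 - p


# Unrestricted case: maximize over a fine grid of p values.
best_p = max((i / 10000 for i in range(1, 10000)), key=two_write_rate)

# Equal-rounds case: solve H(p) = 1 - p on (0, 1/2) by bisection.
# H(p) - (1 - p) is increasing on this interval, negative near 0, positive at 1/2.
lo, hi = 1e-9, 0.5
for _ in range(100):
    mid = (lo + hi) / 2
    if H(mid) < 1 - mid:
        lo = mid
    else:
        hi = mid
p_equal = (lo + hi) / 2
equal_rate = two_write_rate(p_equal)
```

Here `two_write_rate(best_p)` agrees with $\log 3$ and `equal_rate` with the value 1.5458 quoted above, up to the grid and rounding error.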



1.1 Our results 

Our main theorem concerning 2-write WOM codes over the binary alphabet is the following. 

Theorem 1.1. For any $\epsilon > 0$, $0 < p < 1$ and $c > 0$ there is $N = N(\epsilon,p,c)$ such that for
every $n > N(\epsilon,p,c)$ there is an explicit construction of a two-write WOM code of length
$n(1 + o(1))$ of rate at least $H(p) + 1 - p - \epsilon$. Furthermore, the encoding function can be
computed in time $n^{c+1} \cdot \mathrm{poly}(c\log n)$ and decoding can be done in time $n \cdot \mathrm{poly}(c\log n)$. Both
encoding and decoding can be done in logarithmic space.

In particular, for $p = 1/3$ we give a construction of a WOM code whose rate is $\epsilon$-close
to the capacity. If we wish to achieve a polynomial time encoding and decoding then our
proof gives the bound $N(\epsilon,p,c) = (c\epsilon)^{-O(1/(c\epsilon))}$. If we wish to have a short block length, i.e.
$n = \mathrm{poly}(1/\epsilon)$, then our running time deteriorates and becomes $n^{O(1/\epsilon)}$.

In addition to giving a new approach for constructing capacity approaching WOM codes
we also demonstrate a method to obtain capacity approaching codes from existing
constructions (specifically, using the methods of [YKS+10]) without storing huge lookup tables. We
explain this scheme in Section 7.

Using our techniques we obtain the following result for 3-write WOM codes over the 
binary alphabet. 

Theorem 1.2. For any $\epsilon > 0$, there is $N = N(\epsilon)$ such that for every $n > N(\epsilon)$ there is
an explicit construction of a 3-write WOM code of length $n$ that has rate larger than $1.809 - \epsilon$.

Previously the best construction of 3-write WOM codes over the binary alphabet had
rate 1.61 [KYS+10]. Furthermore, the technique of [KYS+10] cannot provably yield codes
that have rate larger than 1.661. Hence, our construction yields a higher rate than the
best possible rate achievable by previous methods. However, we recall that the capacity of
3-write WOM codes over the binary alphabet is $\log(3 + 1) = 2$. Thus, even using our new
techniques we fall short of achieving the capacity for this case. The proof of this result is
given in Section 8.

In addition to the results above, we highlight a connection between schemes for 2-write
WOM codes and extractors for bit-fixing sources, a combinatorial object that was studied
in complexity theory (see Section 5 for definitions). We then use this connection to obtain
new schemes for dealing with defective memory. This result is described in Section 6 (see
Theorem 6.1).



1.2 Is the problem interesting? 

The first observation that one makes is that the problem of approaching capacity is, in some
sense, trivial. This basically follows from the fact that concatenating WOM codes (in the
sense of string concatenation) does not hurt any of their properties. Thus, if we can find,
even in a brute force manner, a code of length $m$ that is $\epsilon$-close to capacity, in time $T(m)$,
then concatenating $n = T(m)$ copies of this code gives a code of length $nm$ whose encoding
algorithm takes $nT(m) = n^2$ time. Notice however, that for the brute force algorithm,
$T(m) \approx 2^{2^m}$ and so, to get $\epsilon$-close to capacity we need $m \approx 1/\epsilon$ and thus $n \approx 2^{2^{1/\epsilon}}$.

The same argument also shows that finding capacity approaching WOM codes for t-write,
for any constant t, is "easy" to achieve in the asymptotic sense, with polynomial
time encoding/decoding functions, given that one is willing to let the encoding length n be
obscenely huge.

In fact, following Rivest and Shamir, Heegard actually showed that a randomized
encoding scheme can achieve capacity for all t [Hee85].



In view of that, our construction can be seen as giving a big improvement over the brute 
force construction. Indeed, we only require $n \approx 2^{1/\epsilon}$ and we give encoding and decoding
schemes that can be implemented in logarithmic space. Furthermore, our construction is 
highly structured. This structure perhaps could be used to find "real-world" codes with 
applicable parameters. Even if not, the ideas that are used in our construction can be 
helpful in designing better WOM codes of reasonable lengths. 

We later discuss a connection with linear seeded extractors for bit-fixing sources. A
small improvement to existing constructions could lead to capacity-achieving WOM codes 
of reasonable block length. 



1.3 Organization 



We start by describing the method of [CGM86, Wu10, YKS+10] in Section 2 as it uses similar
ideas to our construction. We then give an overview of our construction in Section 3 and the
actual construction and its analysis in Section 4. In Section 5 we discuss the connection to
extractors and then show the applicability of extractors for dealing with defective memories
in Section 6. In Section 7 we show how one can use the basic approach of [YKS+10] to
achieve capacity approaching WOM codes that do not need large lookup tables. Finally, we
prove Theorem 1.2 in Section 8.

1.4 Notation



For a $k \times m$ matrix $A$ and a subset $S \subseteq [m]$ we let $A|_S$ be the $k \times |S|$ submatrix of $A$ that
contains only the columns that appear in $S$. For a length $m$ vector $y$ and a subset $S \subseteq [m]$
we denote with $y|_S$ the vector that is equal to $y$ on all the coordinates in $S$ and that has
zeros outside $S$.



2 The construction of [CGM86, Wu10, YKS+10]



As it turns out, our construction is related to the construction of WOM codes of Cohen et al.
[CGM86] as well as to that of Wu [Wu10] and of Yaakobi et al. [YKS+10].^3 We describe the
idea behind the construction of Yaakobi et al. next (the constructions of [CGM86, Wu10]
are similar). Let $0 < p < 1$ be some fixed number.



Similarly to [RS82], in the first round [YKS+10] think of a message as a subset $S \subseteq [n]$
of size $pn$ and encode it by its characteristic vector. Clearly in this step we can transmit
$H(p)n$ bits of information. (I.e. $\log \binom{n}{pn} \approx H(p)n$.)
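The approximation $\log \binom{n}{pn} \approx H(p)n$ can be checked directly for moderate $n$. The values $n = 3000$ and $p = 1/3$ below are arbitrary illustrative choices, not parameters from the text.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits, for 0 < p < 1."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def first_round_bits_per_cell(n, weight):
    """log2 of the number of weight-`weight` subsets of [n], divided by n."""
    return math.log2(math.comb(n, weight)) / n


n, weight = 3000, 1000            # illustrative values, p = 1/3
per_cell = first_round_bits_per_cell(n, weight)
entropy = binary_entropy(weight / n)
gap = entropy - per_cell          # shrinks like O(log(n) / n)
```

The per-cell count is always at most $H(p)$, and the gap is a lower-order term that vanishes as $n$ grows.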

For the second round assume that we already sent a message $S \subseteq [n]$. I.e. we have
already written $pn$ locations. Note that in order to match the capacity we should find a
way to optimally use the remaining $(1-p)n$ locations in order to transmit $(1 - p - o(1))n$
many bits. Imagine that we have a binary MDS code. Such codes of course do not exist
but for the sake of explanation it will be useful to assume their existence. Recall that a
linear MDS code of dimension $n - k$ can be described by a $k \times n$ parity check matrix $A$ having the
property that any $k$ columns have full rank. I.e. any $k \times k$ submatrix of $A$ has full rank.
Such matrices exist over large fields (i.e. parity check matrices of Reed-Solomon codes) but
they do not exist over small fields. Nevertheless, assume that we have such a matrix $A$ that
has $(1-p)n$ rows. Further, assume that in the first round we transmitted a word $w \in \{0,1\}^n$
of weight $|w| = pn$ representing a set $S$. Given a message $x \in \{0,1\}^{(1-p)n}$ we find the unique
$y \in \{0,1\}^n$ such that $Ay = x$ and $y|_S = w$. Notice that the fact that each $(1-p)n \times (1-p)n$
submatrix of $A$ has full rank guarantees the existence of such a $y$. Our encoding of $x$ will
be the vector $y$. When the decoder receives a message $y$, in order to recover $x$ she simply
computes $Ay$. As we did not touch the nonzero coordinates of $w$ this is a WOM encoding
scheme.
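The second-round step described above amounts to solving a linear system over GF(2) in the coordinates outside $S$. The following is a minimal sketch under assumptions made here: the helper names are hypothetical, and the small matrix `A` is a hand-picked example whose columns outside `S` happen to be independent (it is not an MDS matrix, which the text notes does not exist over the binary field).

```python
def solve_gf2(M, rhs):
    """Return one solution z of M z = rhs over GF(2), or None if none exists."""
    rows, cols = len(M), len(M[0])
    aug = [M[i][:] + [rhs[i]] for i in range(rows)]
    pivots, r = [], 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if aug[i][c]), None)
        if p is None:
            continue
        aug[r], aug[p] = aug[p], aug[r]
        for i in range(rows):
            if i != r and aug[i][c]:
                aug[i] = [a ^ b for a, b in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    if any(row[-1] for row in aug[r:]):
        return None                      # zero row with nonzero right-hand side
    z = [0] * cols                       # free variables are set to 0
    for i, c in enumerate(pivots):
        z[c] = aug[i][-1]
    return z


def second_write(A, S, x):
    """Find y with A y = x (mod 2) and y_j = 1 for every j in S, if possible."""
    n = len(A[0])
    free = [j for j in range(n) if j not in S]
    # Cells in S already hold ones; move their contribution to the right-hand side.
    rhs = [(x[i] + sum(A[i][j] for j in S)) % 2 for i in range(len(A))]
    z = solve_gf2([[row[j] for j in free] for row in A], rhs)
    if z is None:
        return None
    y = [1] * n
    for j, bit in zip(free, z):
        y[j] = bit
    return y


def decode(A, y):
    """The decoder simply computes A y over GF(2)."""
    return [sum(a & b for a, b in zip(row, y)) % 2 for row in A]


# Toy example: 3 x 6 matrix, first-round set S = {3, 4, 5}, message x.
A = [[1, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 1],
     [0, 0, 1, 1, 0, 1]]
S = {3, 4, 5}
x = [1, 0, 1]
y = second_write(A, S, x)
```

Since `y` keeps every cell of `S` at 1 and only sets cells outside `S`, the write never resets a cell, and `decode(A, y)` returns `x`.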

As such matrices $A$ do not exist, Yaakobi et al. look for matrices that have many
submatrices of size $(1-p)n \times (1-p)n$ that are full rank and restrict their attention only
to sets $S$ such that the set of columns corresponding to the complement of $S$ has full rank.
(I.e. they modify the first round of transmission.) In principle, this makes the encoding of
the first round highly inefficient as one needs a lookup table in order to store the encoding
scheme. However, [YKS+10] showed that such a construction has the ability to approach
capacity. For example, if the matrix $A$ is randomly chosen among all $(1-p)n \times n$ binary
matrices then the number of $(1-p)n \times (1-p)n$ submatrices of $A$ that have full rank is
roughly $2^{H(p)n}$.

^3 Cohen et al. first did it for $t > 2$ and then Wu used it for $t = 2$. Wu's ideas were then slightly refined
by Yaakobi et al.



Remark 2.1. Similar to the concerns raised in Section 1.2, this method (i.e. picking a
random matrix, verifying that it has the required properties and encoding the "good" sets of
columns) requires high running time in order to get codes that are $\epsilon$-close to capacity. In
particular, one has to go over all matrices of dimension, roughly, $1/\epsilon \times O(1/\epsilon)$ in order to
find a good matrix, which takes time $\exp(1/\epsilon^2)$. Furthermore, the encoding scheme requires
a lookup table whose space complexity is $\exp(1/\epsilon)$. Thus, even if we use the observation
raised in Section 1.2 and concatenate several copies of this construction in order to reach a
polynomial time encoding scheme, it will still require a large space. (And the block length
will even be slightly larger than in our construction.)

Nevertheless, in Section 7 we show how one can trade space for computation. In other
words, we show how one can approach capacity using this approach without the need to store
huge lookup tables.



3 Our method 



We describe our technique for proving Theorem 1.1. The main idea is that we can use a
collection of binary codes that are, in some sense, MDS codes on average. Namely, we show
a collection of (less than) $2^m$ matrices $\{A_i\}$ of size $(1-p-\epsilon)m \times m$ such that for any subset
$S \subseteq [m]$ of size $pm$, all but a fraction $2^{-\epsilon m}$ of the matrices $A_i$ satisfy that $A_i|_{[m] \setminus S}$ has full
row rank (i.e. rank $(1-p-\epsilon)m$). Now, assume that in the first round we transmitted a
word $w$ corresponding to a subset $S \subseteq [m]$ of size $pm$. In the second round we find a matrix
$A_i$ such that $A_i|_{[m] \setminus S}$ has full row rank. We then use the same encoding scheme as before.
However, as the receiver does not know which matrix we used for the encoding, we also send
the "name" of the matrix alongside our message (using additional $m$ bits).

This idea has several drawbacks. First, to find the good matrix we have to check exp(m) 
many matrices, which takes a long time. Secondly, sending the name of the matrix that we
use requires an additional $m$ bits, which makes the construction very far from achieving capacity.

To overcome both issues we note that we can in fact use the same matrix for many
different words $w$. However, instead of restricting our attention to only one matrix and the
set of $w$'s that is good for it, as was done in [YKS+10], we change the encoding in the
following way. Let $M = m \cdot 2^{\epsilon m}$. In the first step we think of each message as a collection
of $M/m$ subsets $S_1, \ldots, S_{M/m} \subseteq [m]$, each of size $pm$. Again we represent each $S_i$ using a
length $m$ binary vector of weight $pm$, $w_i$. We now let $w = w_1 \circ w_2 \circ \ldots \circ w_{M/m}$, where $a \circ b$
stands for string concatenation. For the second stage of the construction we find, for a given
transmitted word $w \in \{0,1\}^M$, a matrix $A$ from our collection such that all the matrices $A|_{[m] \setminus S_i}$
have full rank. Since, for each set $S$ only a $2^{-\epsilon m}$ fraction of the matrices are "bad", we are guaranteed,
by the union bound, that such a good matrix exists in our collection. Notice that finding
the matrix requires time $\mathrm{poly}(M, 2^m) = M^{O(1/\epsilon)}$. Now, given a length $(1-p-\epsilon)M$ string
$x = x_1 \circ \ldots \circ x_{M/m}$ represented as the concatenation of $M/m$ strings of length $(1-p-\epsilon)m$
each, we find for each $w_i$ a word $y_i \in \{0,1\}^m$ such that $Ay_i = x_i$ and $y_i|_{S_i} = w_i$. Our
encoding of $x$ is $y_1 \circ \ldots \circ y_{M/m} \circ I(A)$, where by $I(A)$ we mean the length $m$ string that
serves as the index of $A$. Observe that this time sending the index of $A$ has almost no
effect on the rate (the encoding length is $M = \exp(m)$ and the "name" of $A$ consists of at
most $m$ bits). Furthermore, the number of messages that we encode in the first round is
equal to $\binom{m}{pm}^{M/m} = 2^{(H(p)-o(1))m \cdot M/m} = 2^{(H(p)-o(1))M}$. In the second round we clearly send
an additional $(1-p-\epsilon)M$ bits and so we achieve rate $H(p) + (1-p-\epsilon) - o(1)$ as required.

However, there is still one drawback, which is the fact that the encoding requires $M^{O(1/\epsilon)}$
time. To handle this we note that we can simply concatenate $M^{O(1/\epsilon)}$ copies of this basic
construction to get a construction of length $n = M^{O(1/\epsilon)}$ having the same rate, such that now
encoding requires time $M^{O(1/\epsilon)} = \mathrm{poly}(n)$.

We later use a similar approach, in combination with the Rivest-Shamir encoding scheme, 
to prove Theorem |1.2| . 



4 Capacity achieving 2-write WOM codes 
4.1 Wozencraft ensemble 

We first discuss the construction known as Wozencraft's ensemble. This will constitute our 
set of "average" binary MDS codes. 

The Wozencraft ensemble consists of a set of $2^n$ binary codes of block length $2n$ and
rate 1/2 (i.e. dimension $n$) such that most codes in the family meet the Gilbert-Varshamov
bound. To the best of our knowledge, the construction known as Wozencraft's ensemble first
appeared in a paper by Massey [Mas63]. It later appeared in a paper of Justesen [Jus72]
that showed how to construct codes that achieve the Zyablov bound [Zya71].

Let $k$ be a positive integer and $\mathbb{F} = \mathbb{F}_{2^k}$ be the field with $2^k$ elements. We fix some
canonical invertible linear map $\sigma_k$ between $\mathbb{F}$ and $\mathbb{F}_2^k$ and from this point on we think of each
element $x \in \mathbb{F}$ both as a field element and as a binary vector of length $k$, which we denote
$\sigma_k(x)$. Let $b > 0$ be an integer. Denote by $\pi_b : \{0,1\}^* \to \{0,1\}^b$ the map that projects each
binary sequence on its first $b$ coordinates.

For two integers $0 < b \le k$, the $(k, k+b)$-Wozencraft ensemble is the following
collection of $2^k$ matrices. For $\alpha \in \mathbb{F}$ denote by $A_\alpha$ the unique matrix satisfying $\sigma_k(x) \cdot A_\alpha =
(\sigma_k(x), \pi_b(\sigma_k(\alpha x)))$ for every $x \in \mathbb{F}$.

The following lemma is well known. For completeness we provide the proof below. 

Lemma 4.1. For any $0 \ne y \in \{0,1\}^{k+b}$ the number of matrices $A_\alpha$ such that $y$ is contained in
the span of their rows is exactly $2^{k-b}$.

Proof. Let us first consider the case where $b = k$, i.e., that we keep all of $\sigma_k(\alpha x)$. In
this case $\sigma_k(x) \cdot A_\alpha = (\sigma_k(x), \sigma_k(\alpha x))$. Given $\alpha \ne \beta$ and $x, y \in \{0,1\}^k$ notice that if
$\sigma_k(x) \cdot A_\alpha = \sigma_k(y) \cdot A_\beta$ then it must be the case that $\sigma_k(x) = \sigma_k(y)$ and hence $x = y$. Now,
if $x = y$ and $0 \ne x$ then since $\alpha \ne \beta$ we have that $\alpha x \ne \beta x = \beta y$. It follows that the only
common vector in the span of the rows of $A_\alpha$ and $A_\beta$ is the zero vector (corresponding to
the case $x = 0$).

Now, let us assume that $b < k$. Fix some $\alpha \in \mathbb{F}$ and let $(\sigma_k(x), \pi_b(\sigma_k(\alpha x)))$ be some
nonzero vector spanned by the rows of $A_\alpha$. For any vector $u \in \{0,1\}^{k-b}$ let $\beta_u \in \mathbb{F}$ be the
unique element satisfying $\sigma_k(\beta_u x) = \pi_b(\sigma_k(\alpha x)) \circ u$. Notice that such a $\beta_u$ exists and is equal
to $\beta_u = \sigma_k^{-1}(\pi_b(\sigma_k(\alpha x)) \circ u) \cdot x^{-1}$ ($x \ne 0$ as we started from a nonzero vector in the row
space of $A_\alpha$). We thus have that $\sigma_k(x) \cdot A_{\beta_u} = (\sigma_k(x), \pi_b(\sigma_k(\beta_u x))) = (\sigma_k(x), \pi_b(\sigma_k(\alpha x)))$.
Hence, $(\sigma_k(x), \pi_b(\sigma_k(\alpha x)))$ is also contained in the row space of $A_{\beta_u}$. Since this was true for
any $u \in \{0,1\}^{k-b}$, and clearly for $u \ne u'$, $\beta_u \ne \beta_{u'}$, we see that any such row is contained in
the row space of exactly $2^{k-b}$ matrices $A_\beta$.

It is now also clear that there is no additional matrix that contains $(\sigma_k(x), \pi_b(\sigma_k(\alpha x)))$
in its row space. Indeed, if $A_\gamma$ is a matrix containing the vector in its row space, then let
$u$ be the last $k - b$ bits of $\sigma_k(\gamma x)$. It now follows that $\sigma_k(\gamma x) = \sigma_k(\beta_u x)$ and since $\sigma_k$ is an
invertible linear map and $x \ne 0$ this implies that $\gamma = \beta_u$. □
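For small parameters the ensemble can be built explicitly and the count in Lemma 4.1 verified by enumeration. The sketch below makes several choices that are assumptions of this illustration: $k = 4$, $b = 2$, the field $\mathbb{F}_{16}$ represented modulo $x^4 + x + 1$, and $\sigma_k$ taken to be the bit expansion with the low-order bit first. The check ranges over the nonzero vectors that occur in at least one row space, which is the situation treated in the proof.

```python
from collections import Counter

K, B = 4, 2          # illustrative sizes with 0 < B <= K
MOD = 0b10011        # x^4 + x + 1, irreducible over GF(2)


def gf_mul(a, b):
    """Multiply two elements of GF(2^K) = GF(2)[x] / (MOD)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:
            a ^= MOD
    return result


def sigma(x):
    """sigma_k: field element -> length-K bit vector (low-order bit first)."""
    return tuple((x >> i) & 1 for i in range(K))


def wozencraft_matrix(alpha):
    """Rows of A_alpha: the images of the standard basis vectors."""
    return [sigma(1 << i) + sigma(gf_mul(alpha, 1 << i))[:B] for i in range(K)]


def vec_times_matrix(v, rows):
    """Row vector times matrix over GF(2)."""
    width = len(rows[0])
    return tuple(sum(v[i] & rows[i][j] for i in range(len(rows))) % 2
                 for j in range(width))


def nonzero_row_space(alpha):
    """All nonzero vectors (sigma(x), pi_b(sigma(alpha x))) spanned by A_alpha."""
    return {sigma(x) + sigma(gf_mul(alpha, x))[:B] for x in range(1, 2 ** K)}


# For every vector that appears in some row space, count the matrices containing it.
counts = Counter(v for alpha in range(2 ** K) for v in nonzero_row_space(alpha))
```

Every count equals $2^{k-b} = 4$ here, matching the lemma for these parameters.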

Corollary 4.2. Let $y \in \{0,1\}^{k+b}$ have weight $s$. Then, the number of matrices in the
$(k, k+b)$-Wozencraft ensemble that contain a vector $0 \ne y' \le y$ in the span of their rows is
at most $(2^s - 1) \cdot 2^{k-b} < 2^{k+s-b}$.

To see why this corollary is relevant we prove the following easy lemma. 

Lemma 4.3. Let $A$ be a $k \times (k+b)$ matrix of full row rank (i.e. $\mathrm{rank}(A) = k$) and $S \subseteq [k+b]$
a set of columns. Then $A|_S$ has full row rank if and only if there is no vector $y \ne 0$ supported
on $[k+b] \setminus S$ that is in the span of the rows of $A$.

Proof. Assume that there is a nonzero vector $y$ in the row space of $A$ that is supported on
$[k+b] \setminus S$, and write $y = xA$. Hence, it must be the case that $xA|_S = 0$. Since $x \ne 0$, this means that the rows
of $A|_S$ are linearly dependent and hence $A|_S$ does not have full row rank.

To prove the other direction notice that if $\mathrm{rank}(A|_S) < k$ then there must be a nonzero
$x \in \{0,1\}^k$ such that $xA|_S = 0$. Since $A$ has full row rank it is also the case that $xA \ne 0$.
We can thus conclude that $xA$ is supported on $[k+b] \setminus S$ as required. □

Corollary 4.4. For any $S \subseteq [k+b]$ of size $|S| \le (1-\epsilon)b$, the number of matrices $A$ in
the $(k, k+b)$-Wozencraft ensemble such that $A|_{[k+b] \setminus S}$ does not have full row rank is smaller than
$2^{k - \epsilon b}$.

Proof. Let $y$ be the characteristic vector of $S$. In particular, the weight of $y$ is $\le (1-\epsilon)b$. By
Corollary 4.2, the number of matrices that contain a vector $0 \ne y' \le y$ in the span of their
rows is at most $(2^{(1-\epsilon)b} - 1) \cdot 2^{k-b} < 2^{k - \epsilon b}$. By Lemma 4.3 we see that any other matrix in
the ensemble has full row rank when we restrict to the columns in $[k+b] \setminus S$. □
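Corollary 4.4 can likewise be checked by brute force on a toy ensemble. The parameters below ($k = b = 4$, $\epsilon = 1/2$, the field $\mathbb{F}_{16}$ modulo $x^4 + x + 1$, low-order-bit-first coordinates) are assumptions chosen only to keep the enumeration small; matrix rows are stored as integer bitmasks.

```python
from itertools import combinations

K = B = 4            # illustrative sizes; the text only requires 0 < b <= k
MOD = 0b10011        # x^4 + x + 1, irreducible over GF(2)
EPS = 0.5


def gf_mul(a, b):
    """Multiply two elements of GF(2^K) = GF(2)[x] / (MOD)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:
            a ^= MOD
    return result


def matrix_rows(alpha):
    """Rows of A_alpha as (K+B)-bit integers: bits 0..K-1 hold the identity
    part, bits K..K+B-1 hold the first B coordinates of alpha * (basis element)."""
    mask = (1 << B) - 1
    return [(1 << i) | ((gf_mul(alpha, 1 << i) & mask) << K) for i in range(K)]


def rank_gf2(rows):
    """Rank over GF(2) of a matrix given as a list of integer bitmasks."""
    basis = []
    for r in rows:
        for b in basis:
            r = min(r, r ^ b)     # clear the leading bit of b from r if present
        if r:
            basis.append(r)
    return len(basis)


def bad_matrices(S):
    """Number of alpha for which A_alpha restricted to columns outside S has rank < K."""
    keep = ((1 << (K + B)) - 1) & ~sum(1 << j for j in S)
    return sum(rank_gf2([r & keep for r in matrix_rows(a)]) < K
               for a in range(2 ** K))


max_size = int((1 - EPS) * B)
worst = max(bad_matrices(S)
            for size in range(max_size + 1)
            for S in combinations(range(K + B), size))
bound = 2 ** (K - EPS * B)
```

For these parameters `worst` stays strictly below `bound`, so by a union bound a single matrix is simultaneously good for several sets, which is exactly how the corollary is used in the construction.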
4.2 The construction 

Let $c, \epsilon > 0$ and $0 < p < 1$ be real numbers. Let $n$ be such that

$$\log n < n^{c\epsilon/4} \quad \text{and} \quad 8/\epsilon < n^{(p+\epsilon/2)\epsilon c}.$$

Notice that every sufficiently large $n$ (as a function of $\epsilon$, $p$ and $c$) satisfies this condition. Let $k = (1-p-\epsilon/2) \cdot c\log n$, $b =
(p+\epsilon/2) \cdot c\log n$ and

$$\ell = k \cdot \frac{n}{(c\log n)2^{\epsilon b}} = (1-p-\epsilon/2)\frac{n}{2^{\epsilon b}}.$$

To simplify notation assume that $k$, $b$ and $\ell$ are integers.

Our encoding scheme will yield a WOM code of length $n + \ell$, which, by the choice of $n$,
is at most $n + \ell < (1 + \epsilon/8)n$, and rate larger than $H(p) + (1-p) - \epsilon$.



Step I. A message in the first round consists of $n/(c\log n)$ subsets $S_1, \ldots, S_{n/(c\log n)} \subseteq
[c\log n]$ of size at most $p \cdot (c\log n)$ each. We encode each $S_i$ using its characteristic vector $w_i$
and denote $w = w_1 \circ w_2 \circ \ldots \circ w_{n/(c\log n)} \circ 0_\ell$, where $0_\ell$ is the zero vector of length $\ell$. Reading
the message $S_1, \ldots, S_{n/(c\log n)}$ from $w$ is trivial.



Step II. Let $x = x_1 \circ x_2 \circ \ldots \circ x_{n/(c\log n)}$ be a concatenation of $n/(c\log n)$ vectors of length
$k = (1-p-\epsilon/2)c\log n$ each. Assume that in the first step we transmitted a word $w$
corresponding to the message $(S_1, \ldots, S_{n/(c\log n)})$ and that we wish to encode the message $x$
in the second step. For each $1 \le i \le n/((c\log n)2^{\epsilon b})$ we do the following.

Step II.i. Find a matrix $A_\alpha$ in the $(k, k+b)$-Wozencraft ensemble such that for each
$(i-1)2^{\epsilon b} + 1 \le j \le i2^{\epsilon b}$ the submatrix $(A_\alpha)|_{[c\log n] \setminus S_j}$ has full row rank. Note that Corollary 4.4
guarantees that such a matrix exists. Denote this required matrix by $A_{\alpha_i}$.

Step II.ii. For $(i-1)2^{\epsilon b} + 1 \le j \le i2^{\epsilon b}$ find a vector $y_j \in \{0,1\}^{k+b} = \{0,1\}^{c\log n}$ such
that $A_{\alpha_i} y_j = x_j$ and $y_j|_{S_j} = w_j$. Such a vector exists by the choice of $A_{\alpha_i}$. The encoding of $x$
is the vector $y_1 \circ y_2 \circ \ldots \circ y_{n/(c\log n)} \circ \sigma_k(\alpha_1) \circ \ldots \circ \sigma_k(\alpha_{n/((c\log n)2^{\epsilon b})})$. Observe that the length of the
encoding is $c\log(n) \cdot n/(c\log(n)) + k \cdot n/((c\log n)2^{\epsilon b}) = n + \ell$. Notice that given such an encoding
we can recover $x$ in the following way. Given $(i-1)2^{\epsilon b} + 1 \le j \le i2^{\epsilon b}$ set $x_j = A_{\alpha_i} y_j$, where
$\alpha_i$ is trivially read from the last $\ell$ bits of the encoding.



4.3 Analysis 

Rate. From Stirling's formula it follows that the number of messages transmitted in Step
I is at least $(2^{H(p)c\log n - \log\log n})^{n/(c\log n)} = 2^{H(p)n - n\log\log n/(c\log n)}$. In Step II it is clear that
we encode all messages of length $kn/(c\log n) = (1-p-\epsilon/2)n$. Thus, the total rate is

$$\begin{aligned}
&\big((H(p) - \log\log n/(c\log n)) + (1-p-\epsilon/2)\big)\, n/(n+\ell) \\
&\quad > \big((H(p) - \log\log n/(c\log n)) + (1-p-\epsilon/2)\big)(1-\epsilon/8) \\
&\quad > (H(p) + 1 - p) - \epsilon\log_2(3)/8 - \epsilon/2 - \log\log n/(c\log n) \\
&\quad > H(p) + 1 - p - \epsilon,
\end{aligned}$$

where in the second inequality we used the fact that $\max_p(H(p) + 1 - p) = \log_2 3$. The last
inequality follows since $\log n < n^{c\epsilon/4}$.

Complexity. The encoding and decoding in the first step are clearly done in polynomial
time.^4

In the second step, we have to find a "good" matrix $A_{\alpha_i}$ for all sets $S_j$ such that
$(i-1)2^{\epsilon b} + 1 \le j \le i2^{\epsilon b}$. As there are at most $2^{c\log n} = n^c$ matrices and each has size $k \times c\log n$, we
can easily compute for each of them whether it has full row rank for the set of columns
$[c\log n] \setminus S_j$. Thus, given $i$, we can find $A_{\alpha_i}$ in time at most $2^{\epsilon b} \cdot n^c \cdot \mathrm{poly}(c\log n)$. Thus,
finding all $A_{\alpha_i}$ takes at most

$$\frac{n}{(c\log n)2^{\epsilon b}} \cdot \big(2^{\epsilon b} \cdot n^c \cdot \mathrm{poly}(c\log n)\big) = n^{c+1} \cdot \mathrm{poly}(c\log n).$$

Given $A_{\alpha_i}$ and $w_j$, finding $y_j$ amounts to solving a system of $k$ linear equations in (at most)
$c\log n$ variables, which can be done in time $\mathrm{poly}(c\log n)$. It is also clear that computing
$\sigma_k(\alpha_i)$ requires $\mathrm{poly}(c\log n)$ time. Thus, the overall complexity is $n^{c+1} \cdot \mathrm{poly}(c\log n)$.
Decoding is performed by multiplying each of the $A_{\alpha_i}$ by $2^{\epsilon b}$ vectors so the decoding
complexity is at most $\frac{n}{(c\log n)2^{\epsilon b}} \cdot 2^{\epsilon b} \cdot \mathrm{poly}(c\log n) = n \cdot \mathrm{poly}(c\log n)$.



Theorem 1.1 is an immediate corollary of the above construction and analysis. 



5 Connection to extractors for bit-fixing sources 

Currently, our construction is not very practical because of the large encoding length required
to approach capacity. It is an interesting question to come up with "sensible" capacity achieving
codes. One approach would be to find, for each $n$, a set of $\mathrm{poly}(n)$ matrices $\{A_i\}$ of dimensions
$(1-p-\epsilon)n \times n$ such that for each set $S \subseteq [n]$ of size $|S| = (1-p)n$ there is at least one
$A_i$ such that $A_i|_S$ has full row rank. Using our ideas one immediately gets a code that is
(roughly) $\epsilon$-close to capacity.

One way to try and achieve this goal may be to improve known constructions of seeded
linear extractors for bit-fixing sources. An $(n, k)$ bit-fixing source is a uniform distribution on
all strings of the form $\{v \in \{0,1\}^n \mid v_S = a\}$ for some $S \subseteq [n]$ of size $n - k$ and $a \in \{0,1\}^{n-k}$.
We call such a source an $(S, a)$-source.

Roughly, a seeded linear extractor for $(n, k)$ sources that extracts $k - o(k)$ of the entropy,
with a seed length $d$, can be viewed as a set of $2^d$ matrices of dimension $(k - o(k)) \times n$ such
that for each $S \subseteq [n]$ of size $|S| = n - k$, a $1 - \epsilon$ fraction of the matrices $A_i$ satisfy that
$A_i|_{[n] \setminus S}$ has full row rank.^5

Definition 5.1. A function $E : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is said to be a strong linear
seeded $(k, \epsilon)$-extractor for bit-fixing sources if the following properties hold.^6

• For every $r \in \{0,1\}^d$, $E(\cdot, r) : \{0,1\}^n \to \{0,1\}^m$ is a linear function.

• For every $(n, k)$-source $X$, the distribution $E(X, r)$ is equal to the uniform distribution
on $\{0,1\}^m$ for a $(1 - \epsilon)$ fraction of the seeds $r$.

^4 We do not explain how to encode sets as binary vectors but this is quite easy and clear.
^5 Here we use the assumed linearity of the extractor.
^6 We do not give the most general definition, but rather a definition that is enough for our needs. For a
more general definition see [Rao07].

Note that asking for a $1 - \epsilon$ fraction of the matrices $A_i$ to have full row rank on the
columns outside $S$ is a stronger requirement than what we need, as we would be fine
also if there was one $A_i$ with this property. Currently, the best construction of seeded linear
extractors for $(n, k)$-bit-fixing sources is given in [RRV02], following [Tre01], and has a seed
length $d = O(\log^3 n)$. We also refer the reader to [Rao09] where linear seeded extractors for
affine sources are discussed.

Theorem 5.1 ([RRV02]). For every $n, k \in \mathbb{N}$ and $\epsilon > 0$, there is an explicit strong seeded
$(k, \epsilon)$-extractor $\mathrm{Ext} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^{k - O(\log^3(n/\epsilon))}$, with $d = O(\log^3(n/\epsilon))$.



In the next section we show how one can use the result of [RRV02] in order to design
encoding schemes for defective memory.

Going back to our problem, we note that if one could get an extractor for bit-fixing 
sources with seed length $d = O(\log n)$ then this will give the required $\mathrm{poly}(n)$ matrices and
potentially yield a "reasonable" construction of a capacity achieving two-write WOM code. 

Another relaxation of extractors for bit-fixing sources is to construct a set $\mathcal{A}$ of matrices of
dimension $(1-p-\epsilon)n \times n$, such that $|\mathcal{A}|$ can be as large as $|\mathcal{A}| = \exp(o(n))$, and that
satisfies that given an $(S, a)$-source we can efficiently find a matrix $A \in \mathcal{A}$ such that $A|_{[n] \setminus S}$
has full row rank. It is not hard to see that such a set also gives rise to capacity achieving
WOM codes using a construction similar to ours. Possibly, such an $\mathcal{A}$ could be constructed to
give more effective WOM codes. In fact, it may even be the case that one could "massage"
existing constructions of seeded extractors for bit-fixing sources so that given an $(S, a)$-source
a "good" seed can be efficiently found.



6 Memory with defects 

In this section we demonstrate how the ideas raised in Section 5 can be used to handle defective
memory.

A memory containing n cells is said to have pn defects if pn of the memory cells have some 
value stored on them that cannot be changed. We will assume that the person storing data 
in the memory is aware of the defects, yet the person reading the memory cannot distinguish 
a defective cell from a proper cell. 

The main question concerning defective memory is to find a scheme for storing as much 
information as possible that can be retrieved efficiently, no matter where the pn defects are. 

We will demonstrate a method for dealing with defects that is based on linear extrac- 
tors for bit fixing sources. To make the scheme work we will need to make an additional 
assumption: 

Our assumption: We shall assume that the memory contains $O(\log^3 n)$ cells that are
undamaged and whose identity is known to both the writer and the reader.

We think that our assumption, although not standard, is very reasonable. For example, we
can think of having a very small and expensive chunk of memory that is highly reliable and 
a larger memory that is not as reliable. 

The encoding scheme. Our scheme will be randomized in nature. The idea is that each
memory with $pn$ defects naturally defines an $(n, k)$-source $X$, with $k = (1-p)n$, that is
determined by the values in the defective cells. Consider the extractor Ext guaranteed by
Theorem 5.1. We have that for a $(1-\epsilon)$ fraction of the seeds $r$, the linear map
$\mathrm{Ext}(\cdot, r) : X \to \{0,1\}^{k - O(\log^3(n/\epsilon))}$ has full rank (as it
induces the uniform distribution on $\{0,1\}^{k - O(\log^3(n/\epsilon))}$). In particular, given
a string $y \in \{0,1\}^{(1-p)n - O(\log^3(n/\epsilon))}$, pick a seed
$r \in \{0,1\}^{O(\log^3(n/\epsilon))}$ at random; then with probability at least
$1-\epsilon$ there will be an $x \in X$ such that $\mathrm{Ext}(x, r) = y$.

Thus, our randomized encoding scheme will work as follows. Given the defects, we
define the source $X$ (which is simply the affine space of all $n$-bit strings that have the
same value in the relevant coordinates as the defective memory cells). Given a string
$y \in \{0,1\}^{(1-p)n - O(\log^3(n/\epsilon))}$ that we wish to store to the memory, we will
pick at random $r \in \{0,1\}^d$, for $d = O(\log^3(n/\epsilon))$, and check whether
$\mathrm{Ext}(\cdot, r) : X \to \{0,1\}^{(1-p)n - O(\log^3(n/\epsilon))}$ has full rank. This
will be the case with probability at least $1-\epsilon$. Once we have found such an $r$, we
find $x \in X$ with $\mathrm{Ext}(x, r) = y$. As $x \in X$ and $X$ is "consistent" with the
pattern of defects, we can write $x$ to the memory. Finally, we write $r$ in the "clean"
$O(\log^3(n/\epsilon))$ memory cells that we assumed to have.

The reader, in turn, will read the memory $x$ and then $r$, and will recover $y$ by simply
computing $\mathrm{Ext}(x, r)$.
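The write and read steps just described can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the explicit extractor of Theorem 5.1 is replaced by a pseudo-random binary matrix expanded from a short integer seed (a swapped-in stand-in with no proven guarantee), seeds are tried in order rather than drawn at random so that the run is reproducible, and the names `solve_gf2`, `matrix_from_seed`, `encode` and `decode` are hypothetical helpers introduced here.

```python
import random


def solve_gf2(rows, rhs):
    """Return one solution x (an int bitmask) of the GF(2) system
    rows[i] . x = rhs[i], or None if the system is inconsistent."""
    pivots = {}  # pivot column -> (row mask, rhs bit); rows are kept fully reduced
    for row, bit in zip(rows, rhs):
        for col, (prow, pbit) in pivots.items():
            if (row >> col) & 1:  # clear every existing pivot column
                row ^= prow
                bit ^= pbit
        if row == 0:
            if bit:
                return None  # the equation reads 0 = 1
            continue  # redundant equation
        col = row.bit_length() - 1  # new pivot column
        for c, (prow, pbit) in list(pivots.items()):
            if (prow >> col) & 1:  # keep the older pivot rows reduced
                pivots[c] = (prow ^ row, pbit ^ bit)
        pivots[col] = (row, bit)
    # Free variables are set to 0, so each pivot variable equals its rhs bit.
    return sum(1 << col for col, (_, pbit) in pivots.items() if pbit)


def matrix_from_seed(seed, k, n):
    """ASSUMPTION: stand-in for Ext(., r). Expands a seed into a k x n
    pseudo-random binary matrix; this is not the extractor of Theorem 5.1."""
    rng = random.Random(seed)
    return [rng.getrandbits(n) for _ in range(k)]


def apply_map(matrix, x):
    """Multiply the matrix by the bit vector x over GF(2)."""
    return [bin(row & x).count("1") & 1 for row in matrix]


def encode(y, n, defects, max_seeds=64):
    """defects maps a stuck cell index to its stuck bit. Returns (x, seed)
    with x consistent with the defects and matrix(seed) * x = y."""
    stuck_rows = [1 << j for j in defects]
    stuck_bits = [defects[j] for j in defects]
    for seed in range(max_seeds):  # the paper picks the seed at random
        matrix = matrix_from_seed(seed, len(y), n)
        x = solve_gf2(stuck_rows + matrix, stuck_bits + list(y))
        if x is not None:
            return x, seed
    raise RuntimeError("no suitable seed found")


def decode(x, seed, k, n):
    """The reader knows only x and the seed, not the defect positions."""
    return apply_map(matrix_from_seed(seed, k, n), x)


if __name__ == "__main__":
    n = 64
    defects = {3: 1, 10: 0, 17: 1, 40: 0, 41: 1, 63: 1}
    y = [1, 0, 1, 1, 0, 0, 1, 0] * 6  # 48 message bits
    x, seed = encode(y, n, defects)
    print(decode(x, seed, len(y), n) == y)
```

The defect pattern enters only through the extra equations $x_j = v_j$, so the decoder never needs to know where the defects are, which matches the asymmetry between writer and reader assumed above.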

In conclusion, for any constant* $p < 1$ the encoding scheme described above needs
$O(\log^3 n)$ clean memory cells, and then it can store as much as $(1 - p - \delta)n$ bits for
any constant $\delta > 0$.**

We summarize this result in the following theorem. 

Theorem 6.1. For any constant $p < 1$ there is a randomized encoding scheme that, given
access to a defective memory of length $n$ containing $pn$ defective cells, uses $O(\log^3 n)$
clean memory cells and can store $(1 - p - \delta)n$ bits for any constant $\delta > 0$.

The encoding and decoding times for the scheme are polynomial in $n$ and $1/\delta$.



* The scheme can in fact also work when $p = 1 - o(1)$, and this can be easily deduced from
the above, but we present here the case of $p < 1$.

** Again, we can take $\delta = o(1)$, but we leave this to the interested reader.



7 Approaching capacity without lookup tables



In this section we describe how one can use the techniques of [CGM86, Wu10, YKS+10] in
order to achieve codes that approach capacity without paying the cost of storing huge lookup
tables. A summary of the basic approach appears elsewhere in the paper; we give a
self-contained treatment here.

Let $0 < p < 1$ and $\epsilon$ be real numbers. Let $A$ be a $(1-p)m \times m$ matrix that has
the following property.

Main property of A:

For a $(1-\epsilon)$ fraction of the subsets $S \subset [m]$ of size $pm$ it holds that
$A|_{[m]\setminus S}$ has full rank.

Recall that this is exactly the property that is required by [CGM86, Wu10, YKS+10].
However, while in those works a lookup table was needed, we will show how to trade space
for computation. In particular, our encoding scheme will only need to store the matrix
$A$ itself (whose size is logarithmic in the size of the lookup table).



The encoding scheme. Let $\Sigma = \binom{[m]}{pm}$. In words, $\Sigma$ is the collection of
all subsets of $[m]$ of size $pm$. We denote $\sigma = |\Sigma| = \binom{m}{pm}$. Let
$N = \sigma \cdot m$. We will construct an encoding scheme for $N$ memory cells.

We denote with $\Sigma_g \subseteq \Sigma$ ($g$ stands for "good") the subset of $\Sigma$
containing all those sets $S$ for which $A|_{[m]\setminus S}$ has full rank. We also denote
$\sigma_g = |\Sigma_g| \ge (1-\epsilon)\sigma$.

We let $V = \{0,1\}^{(1-p)m} \setminus \{A \cdot \mathbf{1}\}$ be the set of vectors of length
$(1-p)m$ that contains all vectors except the vector $A \cdot \mathbf{1}$. Clearly
$|V| = 2^{(1-p)m} - 1$.

The first round: A message will be an equidistributed* word in $\Sigma^{\sigma}$. Namely, it
will consist of all $\sigma$ subsets of $[m]$ of size $pm$ each, such that each subset appears
exactly once. We denote this word as $w = w_1 \circ w_2 \circ \dots \circ w_\sigma$ where
$w_i \in \Sigma$ (alternatively, a word is a permutation of $[\sigma]$).

To write $w$ to the memory we will view the $N$ cells as a collection of $\sigma$ groups of $m$
cells each. We will write the characteristic vector of $w_i$ to the $m$ bits of the $i$-th group.



The second round: A message in the second round consists of $\sigma_g$ vectors from $V$.
That is, $x = x_1 \circ \dots \circ x_{\sigma_g}$, where $x_i \in V$.

To write $x$ to memory we first go over all the memory cells and check which $m$-tuples store
sets that belong to $\Sigma_g$. According to our scheme there are exactly $\sigma_g$ such
$m$-tuples. Consider the $i$-th $m$-tuple that belongs to $\Sigma_g$. Assume that it encodes the
subset $S \subset [m]$ (recall that $|S| = pm$). Let $w_S$ be its characteristic vector (note
that this $m$-tuple stores $w_S$). We will find the unique
$y \in \{0,1\}^m \setminus \{\mathbf{1}\}$ such that $Ay = x_i$ and $y|_S = w_S|_S$ (that is, $y$
is 1 on every coordinate of $S$). Such a $y$ exists since $A|_{[m]\setminus S}$ has full rank,
and it differs from $\mathbf{1}$ because $x_i \ne A \cdot \mathbf{1}$.

After writing $x$ to memory in this way, we change the value of the other
$\sigma - \sigma_g$ $m$-tuples to $11\ldots1$. Namely, whenever an $m$-tuple stored a set not
from $\Sigma_g$ we change its value in the second write to $\mathbf{1}$.

* From here on we use the term 'equidistributed' to denote words that contain each symbol of
the alphabet the same number of times.

Recovering $x$ is quite easy. We ignore all $m$-tuples that contain the all 1 vector. We are
thus left with $\sigma_g$ $m$-tuples. If $y_i$ is the $m$-bit vector stored at the $i$-th "good"
$m$-tuple then $x_i = A y_i$.



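Analysis. The rate of the first round is

$$\frac{\log(\sigma!)}{N} = \frac{\log(\sigma!)}{m\sigma} = \frac{\log \sigma}{m} - O\!\left(\frac{1}{m}\right) = \frac{1}{m}\log\binom{m}{pm} - O\!\left(\frac{1}{m}\right) = H(p) - o(1).$$
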

In the second round we get rate

$$\frac{\log\left((2^{(1-p)m} - 1)^{\sigma_g}\right)}{N} = \frac{\sigma_g \cdot \log(2^{(1-p)m} - 1)}{\sigma m} \ge \frac{(1-\epsilon)\log(2^{(1-p)m} - 1)}{m} = (1-\epsilon)(1-p) - O(\exp(-(1-p)m)).$$

Hence, the overall rate of our construction is

$$H(p) + (1-\epsilon)(1-p) + O(1/m).$$

Notice that the construction of [YKS+10] gives rate $\log(1-\epsilon) + H(p) + (1-p)$. Thus,
the loss of our construction is at most

$$\epsilon p + O(1/m) - \log(1-\epsilon) = O(\epsilon + 1/m).$$
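To make the quantities $\sigma$, $\sigma_g$ and the two per-round rates concrete, the following Python sketch counts the good subsets of a small matrix by brute force and evaluates the rate formulas of the analysis. It is an illustration only: the matrix is a seeded pseudo-random one chosen here as an assumption (not a matrix with the main property), the toy parameters $m = 12$, $pm = 4$ are arbitrary, and `rank_gf2`, `count_good_subsets` and `rates` are hypothetical helper names.

```python
import itertools
import math
import random


def rank_gf2(rows):
    """Rank over GF(2) of a list of rows given as int bitmasks."""
    basis = {}  # highest set bit -> reduced row
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in basis:
                basis[top] = row
                break
            row ^= basis[top]
    return len(basis)


def count_good_subsets(matrix, m, s):
    """Return (sigma_g, sigma): the number of size-s subsets S of [m] for which
    the matrix restricted to the columns outside S has full row rank, and the
    total number of size-s subsets."""
    full = (1 << m) - 1
    sigma = sigma_g = 0
    for subset in itertools.combinations(range(m), s):
        keep = full
        for j in subset:
            keep &= ~(1 << j)  # drop the columns indexed by S
        sigma += 1
        if rank_gf2([row & keep for row in matrix]) == len(matrix):
            sigma_g += 1
    return sigma_g, sigma


def rates(m, s, sigma_g, sigma):
    """First- and second-round rates for N = sigma * m cells, logs base 2."""
    first = math.lgamma(sigma + 1) / math.log(2) / (sigma * m)  # log(sigma!) / N
    second = sigma_g * math.log2(2 ** (m - s) - 1) / (sigma * m)
    return first, second


if __name__ == "__main__":
    m, s = 12, 4  # toy parameters, p = 1/3
    rng = random.Random(0)  # ASSUMPTION: arbitrary pseudo-random matrix
    A = [rng.getrandbits(m) for _ in range(m - s)]
    sigma_g, sigma = count_good_subsets(A, m, s)
    first, second = rates(m, s, sigma_g, sigma)
    print(sigma, sigma_g, 1 - sigma_g / sigma, first, second)
```

At such a small $m$ the first-round rate is still well below $H(p)$, which illustrates why the lower-order terms in the analysis only vanish as $m$ grows.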



Note that if [YKS+10] get $\epsilon$-close to capacity then we must have
$m = \mathrm{poly}(1/\epsilon)$ and so our codes get $O(\epsilon)$-close to capacity. To see
that it must be the case that $m = \mathrm{poly}(1/\epsilon)$ we note that by a probabilistic
argument it is not hard to show that, say, $\sigma_g < \sigma/2$. Thus, the rate achieved by
[YKS+10] is at most $H(p) + (1-p) - 1/m$, and so to be $\epsilon$-close to capacity (which is
$\max_p (H(p) + (1-p))$), we must have $m \ge 1/\epsilon$.

Concluding, our scheme enables a tradeoff: for the [YKS+10] scheme to be $\epsilon$-close to
capacity we need $m = \mathrm{poly}(1/\epsilon)$ and therefore the size of the lookup table
that they need to store is $\exp(1/\epsilon)$. In our scheme, the block length is
$\exp(1/\epsilon)$ (compared to $\mathrm{poly}(1/\epsilon)$ in [YKS+10]), but we do not need to
store a lookup table.



8 3-write binary WOM codes 

In this section we give an asymptotic construction of a 3-write WOM code over the binary
alphabet that achieves rate larger than $1.809 - \epsilon$. Currently, the best known methods
give rate 1.61 [KYS+10] and provably cannot yield rate better than 1.661. The main drawback
of our construction is that the block length has to be very large in order to approach this
rate. Namely, to be $\epsilon$-close to the rate the block length has to be exponentially large
in $1/\epsilon$.

An important ingredient in our construction is a 2-write binary WOM code due to Rivest
and Shamir [RS82] that we recall next. The block length of the Rivest-Shamir construction
is 3 and the rate is 4/3. In each round we write one of four symbols {0, 1, 2, 3}, which are
encoded as follows.

Symbol    weight 0/1    weight 2/3
  0          000           111
  1          001           110
  2          010           101
  3          100           011

Table 1: The Rivest-Shamir encoding

In the first round we write for each symbol the value in the 'weight 0/1' column. In the
second round we use, for each symbol, the representation of minimal possible weight that is
a 'legal' write. For example, if in the first round the symbol was 2 and in the second round
it was 1 then we first write 010 and then 110. On the other hand, if in the first round the
symbol was 0 and in the second round it was 1 then we first write 000 and then 001.
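The write rule just described can be sketched in Python as follows. This is a minimal illustration of Table 1, not code from the paper; the names `FIRST`, `SECOND`, `is_legal`, `write` and `read` are hypothetical, and cell contents are represented as 3-character bit strings.

```python
# Rivest-Shamir code words from Table 1, indexed by symbol.
FIRST = {0: "000", 1: "001", 2: "010", 3: "100"}   # weight 0/1 column
SECOND = {0: "111", 1: "110", 2: "101", 3: "011"}  # weight 2/3 column
DECODE = {code: sym for table in (FIRST, SECOND) for sym, code in table.items()}


def is_legal(old, new):
    """A write is legal if no cell goes from 1 back to 0."""
    return all(not (o == "1" and c == "0") for o, c in zip(old, new))


def write(cells, symbol):
    """Return the minimal-weight code word for `symbol` that can legally
    overwrite `cells` (the low-weight column is tried first)."""
    for candidate in (FIRST[symbol], SECOND[symbol]):
        if is_legal(cells, candidate):
            return candidate
    raise ValueError("symbol cannot be written without resetting a cell")


def read(cells):
    """Recover the symbol currently stored in a triplet."""
    return DECODE[cells]


if __name__ == "__main__":
    for first_symbol in range(4):
        for second_symbol in range(4):
            after_first = write("000", first_symbol)
            after_second = write(after_first, second_symbol)
            print(first_symbol, second_symbol, after_first, after_second)
```

Trying the low-weight column first is what makes a repeated symbol, or any symbol written over 000, cost no additional weight, which is the property the weight counts in the rest of this section rely on.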

The basic idea. We now describe our approach for constructing a 3-write WOM code.
Let $n$ and $m$ be integers such that $n = 12m$. We shall construct a code with block length
$n$. We first partition the $n$ cells into $4m$ groups of 3 cells each. A message in the first
round corresponds to a word $w_1 \in \{0,1,2,3\}^{4m}$ such that each symbol appears in $w_1$
exactly $m$ times (we will later "play" with this distribution). We encode $w_1$ using the
Rivest-Shamir scheme, where we use the $i$-th triplet to encode $(w_1)_i$. The second round is
the same as the first round. I.e., we get $w_2 \in \{0,1,2,3\}^{4m}$ that is equidistributed
and write it using the Rivest-Shamir scheme.

Before we describe the third round let us calculate an upper bound on the number of 
memory cells that have value 1, i.e., those cells that we cannot use in the third write. 

Notice that according to the Rivest-Shamir encoding scheme, a triplet of cells (among 
the 4m triplets) stores 111 if and only if, in the first round it stored a symbol from {1, 2, 3} 
and in the second round it stored a zero. Similarly, a triplet has weight 2 only if in both 
rounds it stored a symbol from {1,2,3}. We also note, that a triplet that stored zero in 
the first round, will store a word of weight at most one after the second write. Since in the 
second round we had only m zeros and in the first round we wrote only 3m values different 
than zero, the weight of the stored word is at most 

$$m \times 3 + (3m - m) \times 2 + m \times 1 = 8m = 2n/3.$$

Thus, we still have $n/3$ zeros that we can potentially use in the third write. We can now use
the same idea as in the construction of capacity-achieving 2-write WOM codes and, with the
help of the Wozencraft ensemble, achieve rate $1/3 - o(1)$ for the third write.* Thus, the
overall rate of this construction is $2/3 + 2/3 + 1/3 - o(1) = 5/3 - o(1)$. As before, in order
to be $\epsilon$-close to $5/3$ we need to take $n = \exp(1/\epsilon)$. Note that this idea
already yields codes that beat the best possible rate one can hope to achieve using the
methods of Kayser et al.

* This step actually involves concatenating many copies of the construction with itself to
achieve reasonable running time, and as a result the block length blows up to
$\exp(1/\epsilon)$.



Improvement I. One improvement can be achieved by modifying the distribution of
symbols in the messages of the first round. Specifically, let us only consider messages
$w_1 \in \{0,1,2,3\}^{4m}$ that have at least $4pm$ zeros (for some parameter $p$). The rate of
the first round is thus $(1/3)(H(p) + (1-p)\log(3))$. In the second round we again write an
equidistributed word $w_2$. Calculating, we get that the number of nonzero memory cells after
the second write is at most

$$m \times 3 + (4(1-p)m - m) \times 2 + 4pm \times 1 = 9m - 4pm.$$

Thus, in the third round we can achieve rate
$\frac{3m + 4pm}{12m} - o(1) = p/3 + 1/4 - o(1)$. Hence, the overall rate is

$$(1/3)\cdot(H(p) + (1-p)\log(3)) + (2/3) + (p/3 + 1/4) - o(1).$$

Maximizing over $p$ we get rate larger than 1.69 when $p = 2/5$.
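The maximization for Improvement I can be checked by hand. The following worked derivation is a sketch that assumes all logarithms are base 2 and drops the $o(1)$ term; the name $R_{\mathrm{I}}$ is introduced here only for illustration. Since $R_{\mathrm{I}}$ is concave in $p$, the stationary point is the maximizer.

```latex
\begin{align*}
R_{\mathrm{I}}(p) &= \tfrac{1}{3}\bigl(H(p) + (1-p)\log 3\bigr) + \tfrac{2}{3} + \tfrac{p}{3} + \tfrac{1}{4},\\
\frac{dR_{\mathrm{I}}}{dp} &= \tfrac{1}{3}\Bigl(\log\frac{1-p}{p} - \log 3\Bigr) + \tfrac{1}{3} = 0
  \;\Longleftrightarrow\; \frac{1-p}{3p} = \frac{1}{2}
  \;\Longleftrightarrow\; p = \frac{2}{5},\\
R_{\mathrm{I}}(2/5) &= \tfrac{1}{3}\Bigl(\log\tfrac{5}{2} + \tfrac{3}{5}\Bigr) + \tfrac{2}{3} + \tfrac{2}{15} + \tfrac{1}{4} \approx 1.6906.
\end{align*}
```

The last line uses that at the stationary point $H(p) + (1-p)\log 3 = -\log p + (1-p)$.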



Improvement II. Note that so far we always assumed that the worst had happened, i.e., 
that all the zero symbols of W2 were assigned to cells that stored a value among {1,2,3}. 
We now show how one can assume that the "average" case has happened using the aid of 
two additional memory cells. 

Let $n = 12m$ and $N = n + 2$. As before, let $p$ be a parameter to be determined later.
A message in the first round is some $w_1 \in \{0,1,2,3\}^{4m}$ that has at least $4pm$ zeros.
Again, we use the Rivest-Shamir encoding to store $w_1$ on the first $n$ memory cells. We
define the set $I = \{i \mid (w_1)_i \ne 0\}$. Notice that $|I| \le 4(1-p)m$. In the second
round we get a word $w_2 \in \{0,1,2,3\}^{4m}$ which is equidistributed. We identify an element
$a \in \{0,1,2,3\}$ that appears the least number of times in $(w_2)|_I$. I.e., it is the
symbol that is repeated the least number of times in $w_2$ when we only consider those
coordinates in $I$. We would like this $a$ to be 0 but this is not necessarily the case. So, to
overcome this we change the meaning of the symbols of $w_2$ in the following way: We write $a$
in the last two memory cells (say, using its binary representation) and define a new word
$w'_2 \in \{0,1,2,3\}^{4m}$ from $w_2$ by replacing each appearance of zero with $a$ and vice
versa. We now use the Rivest-Shamir encoding scheme to store $w'_2$. It is clear that we can
recover $w'_2$ and $a$ from the stored information and therefore we can also recover $w_2$ (by
swapping 0 and $a$). The advantage of this trick is that the weight of the stored word is at
most

$$\frac{1}{4}\cdot 4(1-p)m \times 3 + \frac{3}{4}\cdot 4(1-p)m \times 2 + \frac{1}{4}\cdot 4pm \times 0 + \frac{3}{4}\cdot 4pm \times 1 = (9 - 6p)m = (3/4 - p/2)n.$$

Indeed, in $w'_2$ the value zero appears in at most $|I|/4$ of the cells in $I$. Thus, at most
$\frac{1}{4}\cdot 4(1-p)m$ triplets will have the value 111. Moreover, the rest of the zeros
(remember that $w'_2$, like $w_2$, has exactly $m$ zeros) will have to be stored in triplets
that already contain the zero triplet, so they will leave those cells unchanged (and of weight
zero). As a result, in the third round we will be able to store $(1/4 + p/2)n - o(n)$ bits
(this is the number of untouched memory cells after the second round). To summarize, the rate
that we get is*

$$(1/3)\cdot(H(p) + (1-p)\log(3)) + (2/3) + (1/4 + p/2) - o(1).$$

Maximizing over $p$ we get that for $p \approx 0.485$ the rate is larger than 1.76.

Improvement III. The last improvement comes from noticing that so far we assumed
that all the triplets that had weight 1 after the first write have weight at least 2 after the
second write. This can be taken care of by further permuting some of the values of $w_2$.
Towards this goal we shall make use of the following notation. For a word
$w \in \{0,1,2,3\}^{4m}$ let

$$I_0(w) = \{i \mid (w_1)_i \ne 0 \text{ and } w_i = 0\}$$

and

$$I_=(w) = \{i \mid (w_1)_i \ne 0 \text{ and } w_i = (w_1)_i\}.$$

For a permutation $\pi : \{0,1,2,3\} \to \{0,1,2,3\}$ define the word $w_\pi$ to be
$(w_\pi)_i = \pi(w_i)$.

Let $n = 12m$ and $N = n + 5$. As before, let $p$ be a parameter to be determined later. A
message in the first round is some $w_1 \in \{0,1,2,3\}^{4m}$ that has at least $4pm$ zeros. We
use the Rivest-Shamir encoding scheme to store $w_1$ on the first $n$ memory cells. A message
for the second write is $w_2 \in \{0,1,2,3\}^{4m}$. We now look for a permutation
$\pi : \{0,1,2,3\} \to \{0,1,2,3\}$ such that
$|I_0((w_2)_\pi)| \le \frac{1}{4}\cdot 4(1-p)m = (1-p)m$ and
$|I_=((w_2)_\pi)| \ge \frac{1}{4}\cdot 4(1-p)m = (1-p)m$. Observe that such a $\pi$ always
exists. Indeed, as before we can first find $\pi^{-1}(0)$ by looking for the value that appears
the least number of times in $w_2$ on the coordinates where $w_1$ is not zero. Let us denote
this value with $a$. We now consider only permutations that send $a$ to 0. After we apply this
transformation to $w_2$ (namely, switch between $a$ and 0) we denote the resulting word by
$w'_2$. Let $J = \{i \mid (w_1)_i \ne 0 \text{ and } (w'_2)_i \ne 0\}$. I.e., $J$ is the set of
coordinates that we need to consider in order to satisfy $|I_=((w_2)_\pi)| \ge (1-p)m$. By the
choice of $a$ we get that $|J| \ge 4(1-p)m - \frac{1}{4}\cdot 4(1-p)m = 3(1-p)m$. Now, among
all permutations that send $a$ to zero (viewed as acting on $w'_2$, these are the permutations
that fix zero), let us pick one at random and compute the expected size of
$I_=((w'_2)_\pi)$. Notice that when picking a permutation at random the probability that a
coordinate $i \in J$ will satisfy $(w_1)_i = ((w'_2)_\pi)_i$ is exactly $1/3$. Thus, the
expected number of coordinates in $J$ that fall into $I_=((w'_2)_\pi)$ is $|J|/3$. In
particular there exists a permutation $\pi$ that achieves
$|I_=((w'_2)_\pi)| \ge |J|/3 \ge 3(1-p)m/3 = (1-p)m$. Let $\pi_0$ be this permutation. We use
the last 5 memory cells to encode $\pi_0$. As there are $4! = 24$ permutations, this can be
easily done.
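The existence argument above is constructive, since only the six permutations sending $a$ to 0 need to be examined. The Python sketch below finds such a relabeling by brute force and returns the quantities the argument bounds. It is an illustration under assumptions: the function name `choose_relabeling`, the tuple representation of $\pi$ (entry `pi[s]` is the image of symbol `s`) and the random test words are hypothetical choices made here, not part of the paper.

```python
import itertools
import random


def choose_relabeling(w1, w2):
    """Pick a permutation pi of {0,1,2,3} for the second-round word w2.

    Returns (pi, size_I, size_I0, size_J, size_Ieq) where, for the relabeled
    word pi(w2):  I   = coordinates where w1 is nonzero,
                  I0  = coordinates of I where pi(w2) is 0,
                  J   = coordinates of I where pi(w2) is nonzero,
                  Ieq = coordinates of I where pi(w2) equals w1.
    """
    I = [i for i, s in enumerate(w1) if s != 0]
    counts = [sum(1 for i in I if w2[i] == s) for s in range(4)]
    a = counts.index(min(counts))  # least frequent symbol of w2 on I
    J = [i for i in I if w2[i] != a]
    best_pi, best_equal = None, -1
    for pi in itertools.permutations(range(4)):
        if pi[a] != 0:  # only permutations that send a to 0
            continue
        equal = sum(1 for i in J if pi[w2[i]] == w1[i])
        if equal > best_equal:  # the maximum is at least the average |J|/3
            best_pi, best_equal = pi, equal
    return best_pi, len(I), counts[a], len(J), best_equal


if __name__ == "__main__":
    rng = random.Random(1)
    m = 50
    w1 = [rng.choice([0, 0, 1, 2, 3]) for _ in range(4 * m)]  # arbitrary test word
    w2 = [s for s in range(4) for _ in range(m)]              # equidistributed
    rng.shuffle(w2)
    pi, size_I, size_I0, size_J, size_Ieq = choose_relabeling(w1, w2)
    index = list(itertools.permutations(range(4))).index(pi)  # fits in 5 bits
    print(pi, index, size_I, size_I0, size_J, size_Ieq)
```

Storing the index of $\pi_0$ in a fixed ordering of the 24 permutations is one simple way to use the 5 extra cells; any other injective 5-bit encoding would serve equally well.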

Now, we consider the word $(w_2)_{\pi_0}$ and write it to the first $n$ memory cells using the
Rivest-Shamir scheme. Notice that after this second write, the weight of the word stored in
the first $n$ memory cells is at most

$$\frac{1}{4}\cdot 4(1-p)m \times 3 + \frac{2}{3}\cdot 3(1-p)m \times 2 + \frac{1}{3}\cdot 3(1-p)m \times 1 + \frac{1}{4}\cdot 4pm \times 0 + \frac{3}{4}\cdot 4pm \times 1 = (8 - 5p)m = (8 - 5p)n/12,$$

where the term $\frac{1}{3}\cdot 3(1-p)m \times 1$ comes from the contribution of the
coordinates in $I_=((w_2)_{\pi_0})$. Thus, in the third write we can store
$(4 + 5p)n/12 - o(n)$ bits. The total rate is thus

$$(1/3)\cdot(H(p) + (1-p)\log(3)) + (2/3) + (4 + 5p)/12 - o(1).$$

Maximizing, we get that for $p \approx 0.442$ the rate is larger than 1.809.

* The additional two coordinates have no effect on the asymptotic rate.



The proof of Theorem 1.2 easily follows from the construction above. 
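The three rate expressions of Improvements I, II and III can be compared numerically. The Python sketch below evaluates them on a grid and reports the best $p$ for each; it assumes base-2 logarithms, ignores the $o(1)$ terms, and restricts $p$ to $[1/4, 1)$, and the names `first_round_rate`, `THIRD_ROUND`, `total_rate` and `best_p` are hypothetical helpers introduced for illustration.

```python
import math


def binary_entropy(p):
    """H(p) in bits, for 0 < p < 1."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def first_round_rate(p):
    """Rate of first-round words over {0,1,2,3} with a p fraction of zeros."""
    return (binary_entropy(p) + (1 - p) * math.log2(3)) / 3


# Third-round rates (fraction of cells still zero after the second write).
THIRD_ROUND = {
    "I": lambda p: p / 3 + 1 / 4,
    "II": lambda p: 1 / 4 + p / 2,
    "III": lambda p: (4 + 5 * p) / 12,
}


def total_rate(p, variant):
    """First round + second round (2/3) + third round, o(1) terms dropped."""
    return first_round_rate(p) + 2 / 3 + THIRD_ROUND[variant](p)


def best_p(variant, steps=10000):
    """Grid search for the maximizing p in [1/4, 1)."""
    grid = [i / steps for i in range(steps // 4, steps)]
    return max(grid, key=lambda p: total_rate(p, variant))


if __name__ == "__main__":
    for variant in ("I", "II", "III"):
        p = best_p(variant)
        print(variant, p, total_rate(p, variant))
```

A grid search is used instead of a closed form only to keep the sketch short; each objective is concave in $p$, so the grid maximum approximates the true optimum to within the grid spacing.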



8.1 Discussion 

The construction above yields 3-write WOM codes that have rate that is $\epsilon$-close to
1.809 for block length roughly $\exp(1/\epsilon)$. In Theorem 1.1 we showed how one can achieve
capacity for the case of 2-write WOM codes with such a block length. In contrast, for 3-write
WOM codes over the binary alphabet the capacity is $\log(4) = 2$. Thus, even with a block
length of $\exp(1/\epsilon)$ we fail to reach capacity. As described elsewhere in the paper, we
can achieve capacity by letting the block length grow like $\exp(\exp(1/\epsilon))$. It is an
interesting question to achieve capacity for 3-write WOM codes with a shorter block length.

An important ingredient in our construction is the Rivest-Shamir encoding scheme.
Although this scheme does not give the best 2-write WOM code, we used it as it is easy to
analyze and understand the weight of the stored word after the second write. It may be
possible to obtain improved asymptotic results (and perhaps even more explicit constructions)
by studying existing schemes of 2-write WOM codes that beat the Rivest-Shamir construction.